Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost
Poyiadjis et al. (2011) show how particle methods can be used to estimate
both the score and the observed information matrix for state space models.
These methods either suffer from a computational cost that is quadratic in the
number of particles, or produce estimates whose variance increases
quadratically with the amount of data. This paper introduces an alternative
approach for estimating these terms at a computational cost that is linear in
the number of particles. The method is derived using a combination of kernel
density estimation, to avoid the particle degeneracy that causes the
quadratically increasing variance, and Rao-Blackwellisation. Crucially, we show
the method is robust to the choice of bandwidth within the kernel density
estimation, as it has good asymptotic properties regardless of this choice. Our
estimates of the score and observed information matrix can be used within both
online and batch procedures for estimating parameters for state space models.
Empirical results show improved parameter estimates compared to existing
methods at a significantly reduced computational cost. Supplementary materials
including code are available.
Comment: Accepted to the Journal of Computational and Graphical Statistics.
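For context, the linear-cost baseline the abstract alludes to fits in a few lines: the O(N) path-space estimator of the score obtained from Fisher's identity, sketched here for a hypothetical linear-Gaussian model (the model, names, and parameter values are illustrative, not the paper's). Its weakness, which the kernel-density-plus-Rao-Blackwellisation construction is designed to remove, is that resampling collapses the ancestral paths, so the estimator's variance grows quadratically with the length of the series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian toy model (not from the paper):
#   x_t = phi * x_{t-1} + sigma * v_t,   y_t = x_t + tau * w_t
phi, sigma, tau, T = 0.8, 1.0, 1.0, 200
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + sigma * rng.normal()
y = x + tau * rng.normal(size=T)

def path_space_score(y, phi, sigma, tau, N=500):
    """O(N) estimate of d/dphi log p(y_{1:T}) via Fisher's identity.

    Each particle carries the accumulated derivative of its ancestral
    path's transition densities; resampling copies these sums along
    with the particles, which is exactly the path degeneracy that makes
    the estimator's variance grow quadratically with T."""
    xs = rng.normal(0.0, sigma / np.sqrt(1 - phi**2), N)  # stationary init
    alpha = np.zeros(N)                                   # running derivatives
    for t in range(1, len(y)):
        x_new = phi * xs + sigma * rng.normal(size=N)     # propagate
        # d/dphi log N(x_t; phi * x_{t-1}, sigma^2)
        alpha = alpha + (x_new - phi * xs) * xs / sigma**2
        logw = -0.5 * (y[t] - x_new) ** 2 / tau**2        # observation weight
        w = np.exp(logw - logw.max())
        w /= w.sum()
        idx = rng.choice(N, size=N, p=w)                  # multinomial resample
        xs, alpha = x_new[idx], alpha[idx]
    return alpha.mean()

score_est = path_space_score(y, phi, sigma, tau)
```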
Transport Elliptical Slice Sampling
We propose a new framework for efficiently sampling from complex probability
distributions using a combination of normalizing flows and elliptical slice
sampling (Murray et al., 2010). The central idea is to learn a diffeomorphism,
through normalizing flows, that maps the non-Gaussian structure of the target
distribution to an approximately Gaussian distribution. We then use the
elliptical slice sampler, an efficient and tuning-free Markov chain Monte Carlo
(MCMC) algorithm, to sample from the transformed distribution. The samples are
then pulled back using the inverse normalizing flow, yielding samples that
approximate the stationary target distribution of interest. Our transport
elliptical slice sampler (TESS) is optimized for modern computer architectures,
where its adaptation mechanism utilizes parallel cores to rapidly run multiple
Markov chains for a few iterations. Numerical demonstrations show that TESS
produces Monte Carlo samples from the target distribution with lower
autocorrelation compared to non-transformed samplers, and demonstrates
significant improvements in efficiency when compared to gradient-based
proposals designed for parallel computer architectures, given a flexible enough
diffeomorphism.
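Elliptical slice sampling itself is a short algorithm. The sketch below assumes a standard Gaussian N(0, I) reference distribution (the situation a well-trained flow creates) and uses a hypothetical log-likelihood as a stand-in for the flow-transformed target; the function and variable names are illustrative, not TESS itself.

```python
import numpy as np

def ess_step(f, log_lik, rng):
    """One elliptical slice sampling update (Murray et al., 2010),
    assuming the prior is N(0, I). Always terminates: as theta -> 0
    the proposal returns to the current state f, which lies above the
    slice level by construction."""
    nu = rng.normal(size=f.shape)                  # auxiliary prior draw
    log_y = log_lik(f) + np.log(rng.uniform())     # slice level
    theta = rng.uniform(0.0, 2 * np.pi)            # initial angle
    lo, hi = theta - 2 * np.pi, theta              # shrinking bracket
    while True:
        f_prop = f * np.cos(theta) + nu * np.sin(theta)  # point on ellipse
        if log_lik(f_prop) > log_y:
            return f_prop
        if theta < 0:                              # shrink towards theta = 0
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

rng = np.random.default_rng(1)
log_lik = lambda f: -0.5 * np.sum((f - 1.0) ** 2)  # illustrative stand-in
f = np.zeros(3)
for _ in range(100):
    f = ess_step(f, log_lik, rng)
```

Note there is nothing to tune: the bracket-shrinking loop replaces any step-size parameter, which is what makes the sampler attractive to run in parallel across many chains.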
Resilience Engineering’s Potential For Advanced Air Mobility (AAM)
The national airspace system (NAS) will rapidly evolve in the next ten to twenty years. Plans for Advanced Air Mobility (AAM) during that period envision highly automated airspace management systems and electrically powered vehicles. AAM concepts also anticipate limited human roles. The goal of limiting the human role is to minimize the potential for misadventures, yet how the human role is limited needs to be carefully considered in order to also preserve the potential for human successes. The field of resilience engineering (RE) focuses on how systems can change in order to seize an opportunity or withstand an unforeseen challenge. RE methods rely on the use of empirical data to optimize the ability of any system to adapt. RE studies have shown how individual and team initiatives ensure resilient system performance by creating safety through flexibility. Benefits of the RE approach include improved awareness of operational circumstances and of how system elements depend on each other, and the ability to allocate limited resources and prepare for surprise. RE offers the ability to account for and incorporate the human role as an essential element in order to ensure NAS systems’ resilient performance. Data on the human contribution to safe and resilient system performance, termed “work as done,” are available but are not being considered as the NAS evolves. We present an approach that describes how use of RE can enable the evolving NAS to adapt and perform in a resilient manner.
Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates
In recent years, particle-based variational inference (ParVI) methods such as
Stein variational gradient descent (SVGD) have grown in popularity as scalable
methods for Bayesian inference. Unfortunately, the properties of such methods
invariably depend on hyperparameters such as the learning rate, which must be
carefully tuned by the practitioner in order to ensure convergence to the
target measure at a suitable rate. In this paper, we introduce a suite of new
particle-based methods for scalable Bayesian inference based on coin betting,
which are entirely learning-rate free. We illustrate the performance of our
approach on a range of numerical examples, including several high-dimensional
models and datasets, demonstrating comparable performance to other ParVI
algorithms with no need to tune a learning rate.
Comment: ICML 2023.
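The coin-betting mechanism behind these methods can be illustrated on a plain optimization problem. Below is a per-coordinate Krichevsky-Trofimov bettor (the parameter-free scheme of Orabona and Pál that the paper lifts to interacting particle systems); everything here, including the clipping used to keep the "coins" bounded, is an illustrative sketch rather than the paper's algorithm. Note there is no learning rate anywhere: the effective step size is the betting fraction times the accumulated wealth.

```python
import numpy as np

def coin_betting_descent(grad, x0, n_steps=2000, eps=1.0):
    """Learning-rate-free minimization via Krichevsky-Trofimov coin
    betting. Each coordinate runs a gambler whose bet on round t is a
    KT fraction of its accumulated wealth; negative gradients play the
    role of coin outcomes. Returns the averaged iterate."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    wealth = np.full_like(x0, eps)       # initial betting capital
    coin_sum = np.zeros_like(x0)         # running sum of coin outcomes
    avg = np.zeros_like(x0)
    for t in range(1, n_steps + 1):
        c = np.clip(-grad(x), -1.0, 1.0)      # bounded "coin outcomes"
        wealth += c * (x - x0)                # settle the previous bet
        coin_sum += c
        x = x0 + coin_sum / (t + 1) * wealth  # KT fraction of wealth
        avg += (x - avg) / t                  # running iterate average
    return avg

# minimize f(x) = 0.5 * ||x - 3||^2 with no step size anywhere
x_star = coin_betting_descent(lambda x: x - 3.0, np.zeros(2))
```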
Stein Variational Gaussian Processes
We show how to use Stein variational gradient descent (SVGD) to carry out inference in Gaussian process (GP) models with non-Gaussian likelihoods and large data volumes. Markov chain Monte Carlo (MCMC) is extremely computationally intensive for these situations, but the parametric assumptions required for efficient variational inference (VI) result in incorrect inference when they encounter the multi-modal posterior distributions that are common for such models. SVGD provides a non-parametric alternative to variational inference which is substantially faster than MCMC but unhindered by parametric assumptions. We prove that for GP models with Lipschitz gradients the SVGD algorithm monotonically decreases the Kullback-Leibler divergence from the sampling distribution to the true posterior. Our method is demonstrated on benchmark problems in both regression and classification, and a real air quality example with 11,440 spatiotemporal observations, showing substantial performance improvements over MCMC and VI.
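The underlying SVGD update is compact enough to state directly. The following NumPy sketch uses an RBF kernel with the median-heuristic bandwidth on a generic target; the paper applies this same update to the latent variables of a GP model, which this toy code does not attempt.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step=0.2):
    """One SVGD update: phi(x_i) = (1/n) sum_j [k(x_j, x_i) grad log p(x_j)
    + grad_{x_j} k(x_j, x_i)], with RBF kernel k and median-heuristic
    bandwidth. The first term drives particles towards high density;
    the second repels them from one another, preventing collapse."""
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d)
    sq = np.sum(diffs**2, axis=-1)
    h = np.median(sq) / np.log(n + 1) + 1e-8                # median heuristic
    K = np.exp(-sq / h)                                     # kernel matrix
    grads = np.stack([grad_log_p(p) for p in particles])    # (n, d)
    repulsion = 2.0 / h * (K[:, :, None] * diffs).sum(axis=1)
    return particles + step * (K @ grads + repulsion) / n

rng = np.random.default_rng(2)
x = rng.normal(size=(50, 2)) + 2.0        # particles start off-target
for _ in range(500):
    x = svgd_step(x, lambda z: -z)        # target: standard Gaussian N(0, I)
```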
Efficient and Generalizable Tuning Strategies for Stochastic Gradient MCMC
Stochastic gradient Markov chain Monte Carlo (SGMCMC) is a popular class of
algorithms for scalable Bayesian inference. However, these algorithms include
hyperparameters such as step size or batch size that influence the accuracy of
estimators based on the obtained posterior samples. As a result, these
hyperparameters must be tuned by the practitioner and currently no principled
and automated way to tune them exists. Standard MCMC tuning methods based on
acceptance rates cannot be used for SGMCMC, thus requiring alternative tools
and diagnostics. We propose a novel bandit-based algorithm that tunes the
SGMCMC hyperparameters by minimizing the Stein discrepancy between the true
posterior and its Monte Carlo approximation. We provide theoretical results
supporting this approach and assess various Stein-based discrepancies. We
support our results with experiments on both simulated and real datasets, and
find that this method is practical for a wide range of applications.
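The kernel Stein discrepancy that drives such a tuning objective needs only the score of the target, which is exactly what is available in the SGMCMC setting. Below is a V-statistic sketch with an RBF kernel, one of several Stein kernels one could assess; the function names and bandwidth are illustrative, not the paper's choices.

```python
import numpy as np

def ksd_squared(samples, score, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy
    between an empirical sample and a target known only through its
    score s(x) = grad log p(x), using the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 h^2))."""
    x = np.asarray(samples, dtype=float)
    n, d = x.shape
    s = np.stack([score(xi) for xi in x])        # (n, d) target scores
    diff = x[:, None, :] - x[None, :, :]         # (n, n, d) pairwise x - y
    sq = np.sum(diff**2, axis=-1)
    k = np.exp(-sq / (2 * h**2))
    term1 = (s @ s.T) * k                                 # s(x).s(y) k
    term2 = np.einsum('id,ijd->ij', s, diff) / h**2 * k   # s(x).grad_y k
    term3 = -np.einsum('jd,ijd->ij', s, diff) / h**2 * k  # s(y).grad_x k
    term4 = (d / h**2 - sq / h**4) * k                    # tr grad_x grad_y k
    return float(np.mean(term1 + term2 + term3 + term4))

rng = np.random.default_rng(3)
samples = rng.normal(size=(100, 2))                # draws from N(0, I)
gauss_score = lambda z: -z                         # score of N(0, I)
ksd_ok = ksd_squared(samples, gauss_score)         # well-matched sample
ksd_off = ksd_squared(samples + 2.0, gauss_score)  # mismatched sample
```

A well-matched sample yields a small discrepancy and a shifted one a larger value, which is the signal a bandit can exploit to rank hyperparameter configurations.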